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ABSTRACT 

This paper presents a process able to mark digital pictures with an invisible and undetectable secrete infor- 
mation, called the watermark. This process can be the basis of a complete copyright protection system. 
The process first step consists in producing a secrete image . The first part of the secret resides in a basic infor- 
mation that forms a binary image. That picture is then frequency modulated. The second part of the secret is 
precisely the frequencies of the carriers. Both secrets depends on the identity of the copyright owner and on the 
original picture contents. The obtained picture is called the stamp. 

The second step consists in modulating the ampitude of the stamp according to a masking criterion stemming 
from a model of human perception. That too theoretical criterion is corrected by means of morphological tools 
helping to locate in the picture the places where the criterion is supposed not to match. 

This is followed by the adaptation of the level of the stamp at that places. The so formed watermark is then 
added to the original to ensure its protection. 

That watermarking method allows the detection of watermarked pictures in a stream of digital images, only with 
the knowledge of the picture owner's secrets. 

Keywords: copyright protection, watermark, secrete key, masking, human vision model, perceptive compo- 
nents, morphology, robustness, detection, correlation. 



1 GENERAL INTRODUCTION 

With the increasing availability of digitally stored information and the development of new multimedia services, 
security questions are becoming even more urgent. The acceptance of new services depends on whether suitable 
techniques for the protection of the work providers' interests are available. 1 

Moreover the nature of digital media threatens its own viability: 

• First the replication of digital works is very easy and, what is more dangerous, really perfect. The copy is 
identical to the original. 



• The ease of transmission and multiple uses is very worrying, too. Once a single pirate copy has been made, 
it is instantaneously accessible to anyone who wants it, without any control of the original picture owner. 

♦ Eventually the plasticity of digital media is a great menace. Any malevolent user (a pirate) can modify an 
image at will. Such maniplations are really easy for a pirate and put many copyright protection methods 
at risk. 

According to these considerations the conception of a copyright protection system is really vital and it consti- 
tutes a great challenge, because it should cope with all these threats. Without watermarking, most authors will 
not dare to broadcast their work. 

This paper presents an additive watermarking technique. It consits in producing a synthetic picture (also 
called the stamp) which holds informations about the ownership of the original image and depends on the picture 
contents. That stamp is added to the original in a way that resulting picture is perceptually identical to the 
original one and so that the stamp is indetectable by a pirate computer. The aim of that technique is not the 
authentication of the picture content nor the identication of the owner. It is to allow a controller (i.e. the owner's 
computer or a Trusted Third Part) to find out watermarked pictures in a stream of images with the knowledge 
of the owner's secret key in order to detect broadcast of illegal copies. 

The most interesting part of that method is the embedding process i.e. the weighting of each pixels of the 
stamp before adding it to the original. This is based on the masking concept coming from a model of human vision 
(the perceptive model). From this concept was deduced a method which reveals itself actually efficient. Another 
interesting part is the presentation of two methods used for the detection of watermarked pictures without the 
original. This last point is fundemental for the management of the copyright protection. 
Eventually this paper ends with the analyse of the results and the system robustness. 



2 THE MASKING 

2.1 Introduction 

The aim of a watermarking technique is to provide an invisible embedding of a secrete information, the 
watermark. This watermark must be masked (hidden) by the picture it is inlayed in. Precisely a master thesis has 
lead to a masking criterion deduced from physiological and psychophysic studies. 2 Nevertheless, this theoretical 
criterion having been formulated for monochromatic signals, it had to be adaptated to suit real images. 

2.2 The perceptive model: approximation of the eye functionment 

It is now admitted that the retina of the eye splits an image in several components. These components circu- 
late from the eye to the cortex by different tuned channels, one channel being tuned to one component. 

The characteristics of one component are: 

• the location in the visual field (in the image). 

• the spatial frequency (in the Fourier domain: the amplitude in polar coordinates). 

• the orientation (in the Fourier domain: the phase in polar coordinates) 



So, one perceptive channel can only be excited by one component of a signal whose characteristics are tuned to 
its. Components that have different characteristics are independent. 



2-3 The masking concept 

According to perceptive model of human vision, 3 signals that have same (near) components take the same 
channels from the eye to the cortex. It appears that such signals interact and are submitted to non-linear effects. 
The masking is one of those effects. 

Definition: the detection threshold is the minimum level below which a signal can not be seen. 

Definition: the masking occurs when the detection threshold is increased because of the presence of another 
signal. 

In other words, there is masking when a signal can not be seen because of another with near characteristics 
and at a higher level. 



2.4 The masking model 

With the object of modalizing the masking phenomenon, tests have been made on monochromatic signals, 
also called gratings. It appears that the eye is sensitive to the contrast of those gratings. This contrast is defined 



where L is the luminance. 

It is possible to determine experimentally the detection threshold of one signal of contrast C s with respect to the 
contrast C m of the masking signal. That threshold can be modalized as follows: 



by: 
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Such bilogarithmic curves are traced for signals of one single frequency and one orientation (/o, 0q). 
The expression of the detection threshold is thus: 



C, = max[C< h Co&y] 
where e (the slope) depends on (/o, #o)> typically, 0.6 < e < 1.1. 



(2) 



It is possible to extend that expression to introduce frequency dependence. The general expression of the 
detection threshold is becomes: 



C,(C m ,f,Q) = Co + k iMo) (f,0)[C, iMo) (C m )-C o ) 
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In that expression, /o and #o are relevant to the masking signal, / and 0 are relevant to the masked signal, 
F(/o) and G(/o) are parameters that represent the spreading of the Gaussian function, Co is often negligable. 
The spread of the gaussian function depends upon the frequency /o: For frequency, typical bandwith at half 
response are 2,5 octaves at 1 c/d and 1,5 octaves at 16 c/d with a linear decrease between both frequencies. 4 For 
orientation, half bandwith at half response depends on fo and it takes typical values like 30 degrees at 1 c/d and 
15 degrees at 16 c/d. 5 

After this expression, the frequency dependence of the detection threshold has a Gaussian form. Only near 
frequency signals can interact. When the frequency of the masking signal (the mask) is far from this of the signal 
to mask, the detection threshold is almost equal to CV 



2.5 The masking criterion 

It is important to notice that those results concern only gratings signals. To deduce a masking criterion that 
will apply to signals like real images, the preceding masking condition has to be adaptated. So, it is necessary 
to define a new concept able to take the place of the contrast, because the contrast is not define for real images. 
That new concept, 2 is the local energy. 

The local energy is defined on narrowband signals centered around one frequency and one orientation. A 
picture which is a broadband signal is first filtered by Gabor narrowband filters, whose characteristics are near to 
human perception. The local energy around one frequency and one orientation is calculated following the scheme 
presented in this figure: 



The masking criterion: If the local energy of one picture is less than the local energy of the mask, around 
all the frequencies (/o,0o) and for each pixel (x, y), then one can say that the picture is masked by the mask. 
Strictly, a picture is masked by a mask if V(x,t/) and V(/o, #o), ^mask } (j 0 M( x ^ y) ~ Epicture,{fo,e 0 )( x >y)- For real 
images, a good approximation of this criterion can be obtained by using a bank of filters whose central frequencies 
correspond to independent components and which are spread on all the Fourier space. It is admitted that 4 or 
5 frequencies and 4 to 9 orientations are sufficient. The standard choice is twenty filters (5 frequencies and 4 
orientations). 
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Figure 1: Example of basic information used 



2.6 Conclusion 

This section has lead to the expression of an easily implement able masking criterion appliable to any image. 
But this criterion is only an extension of a theoretic criterion appliable to monochromatic signals. Thus cases 
where that criterion does not match are possible. 



This information is a binary picture looking like a modified checkerboard (figure 1). As explained later, the 
pixels value of the square forming that picture can correspond to a binary sequence deduced from the copyright 
owner's (CO) secrete key. 



3.2 The stamp 

In order to take advantage of the eye behaviour, the basic information is modulated at different frequencies 
and orientations corresponding to rather independent components. Moreover, we take care to filter the initial 
checkerboard with a low pass filter (LPF) (i.e. a Butterworth LPF) so that the resulting signal is bandlimited. 
This point is very important because it permits to limit the verification of the masking criterion in the corespond- 
ing channel. 

The position of the modulating carriers is secret. It can be deduced from CO's secret key. In practice, the 
frequency plan is divided into sectors. Each sector is relevant to one perceptive component and defined a group 
of couples (/, 0) where basic information can be modulated. Only one couple is chosen for each sector (because 
couples of a same sector don't stimulate independent components). The picture obtained from the sum of each 
modulated grid is called the stamp S(x, y). 



3 PRINCIPLE OF THE SYSTEM 



3.1 Basic information of the watermark 




(5) 



K represents the set of sectors and (f Xj ,fyj) correspond to the couple chosen in sector j ( this couple is designed 
by the CO's secrete key). 



3.3 The position of the process in a global copyright scheme 



The process should be placed in a copyright protection scheme like drawn at figure 2. 
The skeletization function consists in an image processing program extracting essential characteristics from an 
image. The result is a bitsteam. This must be followed by a hash-function 6 whose result is a succession of blocks 
of bits. Every block has the same length. The skeletization function gives the same result for two near images 
(i.e. original image and watermarked image). But the H-function always gives different results from different 
bitstreams as inputs. So, the inscription keys will be different for perceptually distinct pictures. After the H- 
function, the ciphering function is a trapdoor function. 6 Thanks to this function the inscription keys used to 
deduce the basic grid and the position of the carriers depends on the CO ? s secret key. 

The aim of the use of a trapdoor function is to prevent someone from reproducing the same inscription keys with 
the knowledge of the H-function result. But it is possible for anyone to inverse that trapdoor function and to find 
the H-function result from the inscription keys. It can be interesting in a proof procedure. 



4 IMPLEMENTATION 

4.1 Inscription 

The purpose of the inscription is to adapt the level of each part of the stamp ( for all frequencies ) to make it 
invisible once added to the picture. As mentioned above, each part of the stamp is narrow band. Inscriptions at 
different frequencies are thus independent and one can treat the different components of the stamp one at a time. 
For each frequency designed by the inscription keys, the procedure is divided in three steps : the modulation, the 
regulation of the level and the correction. 

• Modulation 

The first step consists in the modulation of the particular carrier by the lowpass grid G(x, y). The result is 
G(z,y).cos(f Xj .x + f yj y), where f Xj and f y . are the carrier position. 

♦ Regulation of the level 

According to the perceptual model, in order to guarantee the invisibility of the watermark its local energy 
has to be inferior to the picture local energy for each pixel around the inscription frequency. A way to 
reach this objective is to multiplicate the modulated grid by a weighting mask Weight j (x, y) reducing 
the amplitude of the stamp where energy in the correponding component of the original picture is weak. 
Nevertheless, one must take care to keep the narrow band characteristic of the resulting signal Sj(x,y) (= 
Weightj(x,y).G(x, y).co$(f rj .x + f yj .y)) in order to avoid non linear interactions between different parts of 
the stamp. In conclusion, V7", we have to find a signal Weight j(x,y) so that: 

- V(x,y) E Sj {x,y) < E WmjJmj) (x t y) 

- Sj is narrow band 

For simplification, lets consider W eight j(x,y) be composed of two factors: 

- cxj , a constant factor (fixing the global level of the stamp) . 

- Mj(x,y) } a mask whose values £ [0, 1]. 

When ctj is chosen, the way to find Mj(x,y) so that Weight j(x, y) satisfy the conditions defined above is 
the following: 
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Figure 2: Global scheme for copyright protection. 



- Firstly, Mj (x, y) is a binary mask. Mj (x, y) = 1 when the local energy of the stamp pemits the masking 
and Mj{x, y) = 0 when the local energy of the stamp is too important. It is obvious that the initial 
choice of aj has a direct influence on Mj{x,y). Indeed, a great aj value will lead to put most of the 
Mj{x, y) values to zero, while a small aj value will lead to keep most of Mj(x, y) values at one. 

- Secondly, Weight j(x, y) is filtered so that the stamp remains narrow band. 

- After this second step, one has found a signal aj.Mj(x, y).G(x, y) which is better masked than 
aj.G(x,y). In order to really satisfy the masking criterion V(x,y), this procedure must be repeated 
iteratively, taking Mj(x,y).G(x,y) as new G(x,y). Experiments have shown that only two iterations 
are sufficient to have a result satisfying the masking criterion everywhere. 

One important question remains: how to choose ocj? , 

It has already been said that the more aj increases, the more Mj(x,y) has points equal to zero. A trade off 
has to be found by means of a defined criterion. Maximizing the correlation at the detection (by maximiz- 
ing ^2 aj .Mj (x , y) .G(x , y)) could have been a good criterion, but such a criterion often tends to impose an 
optimum with a lot of points equal to zero and a small number of points with a great value. The addition 
of the so obtained watermark generally entails a degradation of the picture quality. This emphasizes the 
lack of the masking criterion used. 

As mentioned in section 2.6, the invisibility criterion used here is an extension for real images. It appears 
that this extension entails some imperfections. This criterion being insufficient, some improvements have 
been brought thanks to experimental results. 

The conclusion of these observations is that the invisibility is only strictly observed in high activity re- 
gions, where the local energy of high fequencies is important. These regions have to be favoured during the 
inscription in the sense that the level of the watermark will be increased in those regions while it has to be 
decreased in other regions. 

The correction process first isolates the high activity regions (figure 3. a). Then, an homogeneization of this 
picture is performed by use of morphological tools, e.g. one opening and one closing (figure 3.b). After 
a leveling (in fact, a division by the mean or mean square value of the homogenized mask), we obtain a 
new mask used to multiply the picture local energy and so, giving an advantage to regions of highfrequency 
energy in comparison with other areas. After that correction, the process is identical to the one described 
previously. Moreover, the complexity is not increased. Indeed, we first work on the inscription at high 
frequencies (where there is no quality problems). The value of high frequency local energy is then used for 
the calculation of the correcting mask used for inscription at lower frequencies. The correction scheme is 
drawn in the following schema. 



£ HF energy » opening ^ closing „ leveling. ^shaping MASK 



4.2 Detection 

The aim is to detect if a watermark has been embedded. This can be done with the use of a correlation, but 
first it is necessary to isolate the watermark and then to demodulate it in order to reconstruct something that is 
highly correlated with the basic information (the grid). 
The formulation of the watermark is: 



y) = A 3 C0S {fx r x + fyj-y) 



(6) 



(a) <"> 

Figure 3: Correcting mask for Lena: (a) Areas of high frequencies, (b) Morphological homogeneization of the 
mask. 



whereAj = aj.G(x, y).M(x J y) (7) 

In this expression, M(x } y) adjusts the level of the grid in order it becomes invisible, it is called a mask, and its 
maximal value is one. 

ccj is a constant that used to normalize the mask, it must be as high as possible. 



The detection is divided in three steps : teh demodulation, the correlation and the decision. 

• Demodulation 

I W (x, y) = Y, A o cos (f*j x + hi V) + J o + N(x t y) (8) 

where I w {x,y) is the watermarked picture, I 0 {x : y) is the original picture and N(x y y) is an additive noise 
from the channel. 

The demodulation consists in multiplying Iw by cos(f Xj .x-\- / yj .y),Vj G K and then to filter with a LP 
filter. 

The result will be : 

D j (x,y) = -.A j (x i y) + N*(x i y) (9) 

N*(x,y) depends on the image and on the additive noise. The other parts of the stamp will be eliminated 
by the LP filter. 

• Correlation It consists in mutiplying the demodulated information D(x, y) = J2jeK Dj( x > V) w ^ tn tne basic 
grid G(x,y). If the picture has not been too deteriorated, D{x ) y) and G(x,y) should be similar. 



(10) 



= E Q i D G2 (*> y)- M i( x > y) + G (*< f)l 



(ii) 



j£K x,y 



In 11, the first term is even greater than the second, because G and N* have null average values. 

So C exclusively depends on the watermark value. 

in the case the grid is not the good one, the correlation gives: 



c* = E «i £ G (*> y)-°*^ y)- M A*>y) 



(12) 



jeK x,y 

C* C if the choice of the basic information has been appropriate. 



• decision 



The detection algorithm performs demodulations and correlations at diverse frequencies and with diverse 
grids. The decision is made after the comparison of these correlations. 



The first and probably mosty important result is the invisibility of the stam in all images that were tested. 
Figure 4. a and b compares the original and stamped picture for Lena. In figure 4.e, omne observes the watermark 
that was added to the original picture. 

Two methods were used to determine whether an image is watermarked or not. The first one consists in 
comparing the result of C the correlation made with the right grid G(x,y) from the right key with C* the 
correlation made with G*(a;,t/), the grid obtained by random keys see 12. If the picture is watermarked, the 
correlation with the right key is even greater than the random correlations. The results below (Figure 5) show 
the pertinence of this method. 

The second method uses a grid G(x,y) formed from a MLS sequence, having good correlation properties. 
Correlations are made with shifted versions of the basic grid. Due to these good correlation properties, the 
correlation with the the right grid gives a result even greater than the correlations with shifted grids. Results are 
presented below (figure 4.c and d), if a picture is watermarked, a pick appears in the center. 



Many tests have been performed concerning usual pictures deteriorations in image processing like blurring and 
compression. The inspection of these results are quite satisfying, but expected due to the frequency approach. 
For all classical pirate attacks like zoom, cropping, overwatermarking it is not as simple. The overwatermarking 
makes no problem, the presence of the watermark is still detected. But for zoom and cropping, the remaing point 
is to find a few tools permitting to complete the process. The concept of these tools is already denned but yet no 
implementation has been acheived. 7 



5 RESULTS 



6 SYSTEM ROBUSTNESS 



7 CONCLUSION 



The process developed here allows the watermarking of the ownership of any picture. The perceptual approach 
used here is probably the best one, that is why the results obtained are so satifying compared with other methods 
and this method is so performant. Nevertheless studies are still running to acheive a new goal, consisting in 




Figure 4: Results for Lena: (a) Original, (b) Watermarked one, (c) Correlation grafic for original, (d) Correlation 
grafic for watermarked, (e) Watermark. 
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Figure 5: Results of correlation for Lena and decision. 



making more information (e.g. ownership, date of marking) readable by the key owner from the watermark. This 
could be useful for real copyright protection protocols 8 . 9 
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